What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

Authors

  • Enrico Santus
  • Alessandro Lenci
  • Tin-Shing Chiu
  • Qin Lu
  • Chu-Ren Huang
Abstract

In this paper, we claim that Vector Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting this intersection according to the rank of the shared contexts in the dependency ranked lists. This claim comes from the hypothesis that similar words do not simply occur in similar contexts, but share a larger portion of their most relevant contexts than other related words do. To prove it, we describe and evaluate APSyn, a variant of Average Precision that, independently of the adopted parameters, outperforms Vector Cosine and co-occurrence on the ESL and TOEFL test sets. In the best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy on the TOEFL dataset, thereby beating the non-English US college applicants (whose average, as reported in the literature, is 64.50%) and several state-of-the-art approaches.
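The measure described in the abstract can be illustrated with a small sketch. It assumes that each word is represented as a mapping from contexts to association scores, that only the top N contexts per word are compared, and that each shared context is weighted by the inverse of its average rank in the two lists. The function names, the toy vectors, and the default value of N are hypothetical and chosen only for illustration; the abstract itself does not fix these details.

```python
from typing import Dict, List


def top_contexts(weights: Dict[str, float], n: int) -> List[str]:
    """Return the n contexts with the highest association score, best first."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ctx for ctx, _ in ranked[:n]]


def apsyn(w1: Dict[str, float], w2: Dict[str, float], n: int = 1000) -> float:
    """Rank-weighted overlap of the top-n contexts of two words.

    Assumption: a shared context contributes 1 / (average of its two ranks),
    so contexts that are highly ranked for both words count the most.
    """
    rank1 = {ctx: r for r, ctx in enumerate(top_contexts(w1, n), start=1)}
    rank2 = {ctx: r for r, ctx in enumerate(top_contexts(w2, n), start=1)}
    shared = rank1.keys() & rank2.keys()
    return sum(2.0 / (rank1[c] + rank2[c]) for c in shared)


def pick_synonym(target: str, candidates: List[str],
                 vectors: Dict[str, Dict[str, float]], n: int = 1000) -> str:
    """Answer a multiple-choice synonym question by taking the best-scoring candidate."""
    return max(candidates, key=lambda c: apsyn(vectors[target], vectors[c], n))


# Toy association scores (invented for illustration, not taken from any corpus).
toy_vectors = {
    "big":   {"house": 5.0, "dog": 3.0, "city": 2.0},
    "large": {"house": 4.0, "city": 3.5, "dog": 1.0},
    "tiny":  {"insect": 4.0, "dog": 2.0, "house": 0.5},
}

best = pick_synonym("big", ["large", "tiny"], toy_vectors, n=3)
print(best)
```

In this toy setting, "big" and "large" share all three of their top contexts, while "big" and "tiny" share only two at lower combined ranks, so "large" is selected. Accuracy on a test set such as ESL or TOEFL would then be the fraction of questions for which the selected candidate is the correct one.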


Similar resources

A Comparative Study of Class Activities and Students’ Expectations of IELTS and TOEFL iBT Preparation Courses: A Methodological Triangulation Washback Study

Washback refers to the influence of a test on teaching and learning. This study was an attempt to compare the influence of IELTS and TOEFL iBT on the expectations the students brought to their courses and to investigate how these expectations were fulfilled. To this end, 100 IELTS and 120 TOEFL iBT students attending preparation courses took a questionnaire survey, and a sample of their ten cla...


The Effect of Role-Play and Simulation Approach on Enhancing ESL Oral Communication Skills

This study investigated the effect of role-play and simulation approach on Malaysian Polytechnic engineering students’ ESL oral communication skills. In addition, the study examined the students’ perceptions of the effect of the role-play and simulation on their oral communication skills. A mixed method design was employed, using both quantitative and qualitative data collection app...


Motivation, amount of interaction, length of residence, and ESL learners’ pragmatic competence

This study examined how motivation for learning English, the amount of contact with English, and length of residence in the target language area affect Korean graduate students’ English pragmatic skills. The study attempted to account for differential pragmatic development among 50 graduate-level Korean students in relation to individual factors mentioned above. The data were...


Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL

This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, called PMI-IR, uses Pointwise Mutual Information (PMI) and Information Retrieval (IR) to measure the similarity of pairs of words. PMI-IR is empirically evaluated using 80 synonym test questions from the Test of English as a Fo...


Towards a Reappraisal of Literary Competence within the Confines of ESL/EFL Classroom

The present paper aimed at highlighting the judicious incorporation of literary genres (i.e. novel, short story/fiction, drama, and poetry) as a supposedly inspiring teaching technique and an allegedly potent learning resource into ESL/EFL curricula. The rationale behind this pedagogical inclusion is to promote both teaching and learning effectiveness through capitalizing intensively on the gen...



Journal title:
  • CoRR

Volume abs/1603.08701

Publication date 2016